Conversation

@JegernOUTT
Member

No description provided.

@JegernOUTT changed the title from "Reviveing some functionality" to "Reviving some functionality" on Nov 27, 2025
Convert HashMap to AHashMap using .into_iter().collect() to match
the expected type for tokenizers::WordPiece::builder().vocab().
This fixes compilation error E0277.
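
The conversion described above can be sketched with the standard library alone. This is a minimal illustration, not the code from this PR: `OtherMap` is a hypothetical stand-in for `AHashMap` (a `HashMap` with a non-default hasher), and `convert_vocab` is an invented helper name. The point is only that `.into_iter().collect()` rebuilds the map as whatever `FromIterator` target the call site expects.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::BuildHasherDefault;

// Assumption: stand-in for AHashMap, i.e. a map type that differs from the
// default HashMap only in its hasher.
type OtherMap<K, V> = HashMap<K, V, BuildHasherDefault<DefaultHasher>>;

// Hypothetical helper: moves every (key, value) pair into the target map type.
fn convert_vocab(vocab: HashMap<String, u32>) -> OtherMap<String, u32> {
    vocab.into_iter().collect()
}

fn main() {
    let mut vocab = HashMap::new();
    vocab.insert("[UNK]".to_string(), 0u32);
    vocab.insert("hello".to_string(), 1u32);

    let converted = convert_vocab(vocab);
    assert_eq!(converted.len(), 2);
    assert_eq!(converted.get("hello"), Some(&1));
}
```

Passing the original `HashMap` where the other map type is expected is the kind of mismatch that surfaces as E0277 (trait bound not satisfied) or a mismatched-types error, depending on the signature.
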
Use .into() to let the compiler convert the array into the appropriate
type (HashMap for local tokenizers 0.21.1, AHashMap for CI tokenizers 0.21.4).
This ensures compatibility across different tokenizers versions.
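
As a rough sketch of the `.into()` approach, the example below uses only `std::collections::HashMap`, which implements `From<[(K, V); N]>`. The function `vocab_size` is hypothetical and stands in for any API that takes a concrete map type; the assumption that the CI-side map type offers the same array conversion comes from the commit message, not from this sketch.

```rust
use std::collections::HashMap;

// Hypothetical consumer: stands in for a builder method whose parameter
// fixes the concrete map type.
fn vocab_size(vocab: HashMap<String, u32>) -> usize {
    vocab.len()
}

fn main() {
    // The parameter type drives inference, so `.into()` converts the array
    // into whichever map type the callee declares.
    let n = vocab_size(
        [
            ("[UNK]".to_string(), 0u32),
            ("hello".to_string(), 1u32),
        ]
        .into(),
    );
    assert_eq!(n, 2);
}
```

Because the call site never names the map type, the same line compiles against either signature as long as the target type implements `From` for the array.
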
- Update tokenizers from 0.21.1 to 0.21.4 to match CI version
- Simplify vocab initialization using array literal without .into()
- This ensures consistent behavior across all build environments
- Normalize CRLF to LF in all parser tests before parsing
- This keeps byte offsets consistent across Windows and Unix platforms
- Fixes test failures in test_kotlin_main and test_kotlin_person on Windows
- Root cause: the reference JSON files use LF-based byte offsets, so CRLF input shifts every offset after the first line break
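
A minimal sketch of the normalization step, assuming the test sources are read as UTF-8 strings. The helper name `normalize_line_endings` and the Kotlin snippet are illustrative and not taken from the test suite.

```rust
// Hypothetical helper: replaces every CRLF pair with a single LF so byte
// offsets match references generated from LF files.
fn normalize_line_endings(source: &str) -> String {
    source.replace("\r\n", "\n")
}

fn main() {
    let crlf = "fun main() {\r\n    println(\"hi\")\r\n}";
    let lf = normalize_line_endings(crlf);

    // Each CRLF before a token pushes its byte offset forward by one.
    assert_eq!(crlf.find("println"), Some(18));
    assert_eq!(lf.find("println"), Some(17));
    assert!(!lf.contains('\r'));
}
```

Normalizing in the test harness, rather than regenerating the reference files per platform, keeps a single set of expected offsets.
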

2 participants